Hands-on Exercise 2

Published

January 13, 2024

Content & Theory

Understanding the difference between classified and unclassified choropleth maps:

  • Classified Choropleth Map:
    • Use Case: Classified choropleth maps are more common and appropriate when you want to highlight patterns, trends, or variations in the data across different classes or ranges.
    • Data Characteristics: If your data naturally falls into distinct categories or classes, such as income brackets, population density ranges, or temperature zones, a classified choropleth map is suitable.
    • Communication of Patterns: This type of map is effective in communicating the relative differences and relationships between geographic regions within each class.
    • Example: Showing different income levels across regions, where each class represents a specific income range, and different colors represent different classes.
  • Unclassified Choropleth Map:
    • Use Case: Unclassified choropleth maps are used when the emphasis is on the overall distribution of the data rather than on specific classes. This type is also known as a continuous (unclassed) choropleth map.
    • Data Characteristics: If your data is more continuous and doesn’t naturally fall into distinct classes, such as temperature gradients or precipitation levels, an unclassified choropleth map is more appropriate.
    • Communication of Gradations: This type of map is suitable when you want to communicate the gradual change or intensity of a variable across regions without categorizing them into specific classes.
    • Example: Showing a gradient of population density across regions without predefined categories; the map could represent a smooth transition from low to high density.
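
As a rough illustration of the difference, the base R sketch below colours the same values both ways. The values, class boundaries and hex colours are made up for illustration and are not part of the exercise data.

```r
# Hypothetical population densities for five regions (illustrative only)
values <- c(120, 340, 560, 910, 1500)

# Classified: each value is assigned to one of three classes,
# and every region in a class gets the same colour
classes <- cut(values, breaks = c(0, 500, 1000, 2000))
class_cols <- c("#DEEBF7", "#9ECAE1", "#3182BD")[as.integer(classes)]

# Unclassified: the colour is interpolated directly from the value,
# so regions with different values get different shades
ramp <- colorRamp(c("#DEEBF7", "#3182BD"))
scaled <- (values - min(values)) / (max(values) - min(values))
cont_cols <- rgb(ramp(scaled), maxColorValue = 255)

data.frame(values, classes, class_cols, cont_cols)
```

The classified version uses only three colours, while the unclassified version gives each distinct value its own shade.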

Understanding how data distribution affects choosing classes for a choropleth map:

The shape of the distribution influences the effectiveness of each classification method. Knowing whether your data is skewed, evenly distributed, has natural breaks, or follows a particular pattern helps you choose the classification method that best represents the underlying spatial patterns in a choropleth map. The right method makes the map easier to interpret and keeps the representation consistent with the characteristics of the data.

  • Quantiles:
    • Importance of Distribution: Quantiles divide the data into equal-sized groups, which is especially useful when dealing with data that may have skewed distributions or outliers. Quantiles are less sensitive to extreme values.
    • Use Case: Effective for data with varying levels of intensity, where you want to ensure that each category represents an equal proportion of the total observations.
  • Equal Interval:
    • Importance of Distribution: Equal interval classification divides the range of values into equal intervals. This method is suitable when the data exhibits a fairly uniform distribution.
    • Use Case: Works well for data with a linear distribution, where equal intervals provide a simple and easy-to-understand representation.
  • Natural Breaks (Jenks):
    • Importance of Distribution: Natural breaks aim to identify natural groupings or clusters in the data. It is useful when the data has distinct breaks or modes.
    • Use Case: Effective for data with clear breaks or patterns, helping to emphasize differences between groups.
  • Standard Deviation:
    • Importance of Distribution: Standard deviation classification is suitable for data with a normal distribution. It places values into classes based on the standard deviation from the mean.
    • Use Case: Appropriate for normally distributed data, where you want to highlight variations from the average.
  • Defined Interval:
    • Importance of Distribution: Defined interval classification allows you to specify custom class intervals. It is flexible and can be adapted to the specific characteristics of your data.
    • Use Case: Useful when you have prior knowledge or specific criteria for defining meaningful intervals in your data.
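
To see how the distribution changes the outcome, the base R sketch below computes quantile and equal-interval breaks for the same right-skewed values and counts how many observations land in each class. The values are hypothetical and the variable names are chosen here for illustration; in practice the classInt package listed later offers these methods ready-made.

```r
# Hypothetical right-skewed values with one extreme outlier
x <- c(0.1, 0.5, 0.6, 0.65, 0.7, 0.75, 0.8, 0.9, 1.0, 19.0)
n_classes <- 5

# Quantile breaks: each class holds the same share of the observations
quantile_breaks <- quantile(x,
                            probs = seq(0, 1, length.out = n_classes + 1),
                            names = FALSE)

# Equal-interval breaks: each class spans the same width of the value range
equal_breaks <- seq(min(x), max(x), length.out = n_classes + 1)

# Count the observations per class under each method
quantile_counts <- table(cut(x, breaks = quantile_breaks, include.lowest = TRUE))
equal_counts    <- table(cut(x, breaks = equal_breaks, include.lowest = TRUE))

quantile_counts
equal_counts
```

With a single outlier stretching the range, the equal-interval classes leave most observations in the first class and several classes empty, while the quantile classes stay evenly populated.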

An overview of mapping packages in R

  1. tmap:

    • Purpose: tmap is a package for creating thematic maps in R. It provides a framework for easily creating static and interactive maps with a focus on simplicity and flexibility. It allows users to create map visualizations with various map types, legends, and thematic mapping techniques.
  2. mapsf:

    • Purpose: mapsf is another R package for creating thematic maps. It builds on the simplicity of the sf package for handling spatial data and aims to provide an easy-to-use interface for producing high-quality maps.
  3. leaflet:

    • Purpose: leaflet is an R package that interfaces with the JavaScript library Leaflet for creating interactive maps. It is particularly useful for web-based mapping applications and provides a simple way to add interactive features like zooming, panning, and pop-ups.
  4. ggplot2:

    • Purpose: ggplot2 is a powerful and versatile package for creating static graphics in R. While not specifically designed for maps, it can be used for creating static choropleth maps and other types of visualizations. When combined with spatial data, it allows for sophisticated data visualization.
  5. ggmap:

    • Purpose: ggmap is an extension of ggplot2 that facilitates the integration of static maps from Google Maps, OpenStreetMap, or other mapping providers into your ggplot2 visualizations. It is useful for creating data visualizations that incorporate a background map.
  6. quickmapr:

    • Purpose: quickmapr is a package designed for the quick and easy visualization of spatial data, including shapefiles and raster data. It provides functions to load, plot, and analyze spatial data in a straightforward manner.
  7. mapview:

    • Purpose: mapview is an R package that provides an interactive viewing environment for spatial data. It allows users to explore spatial datasets with an interactive map viewer, making it easier to inspect and analyze the data.
  8. RColorBrewer:

    • Purpose: RColorBrewer is not a mapping package per se, but it is a package that provides color palettes suitable for thematic mapping. It is often used in conjunction with other mapping packages to enhance the visual appeal and interpretability of maps.
  9. classInt:

    • Purpose: classInt is a package that provides functions for choosing class intervals for choropleth maps. It offers different methods for classifying numerical data into meaningful intervals, such as quantiles, equal intervals, and more.

These packages cover a range of functionalities, from creating static and interactive maps to handling spatial data, adding background maps, and choosing appropriate color schemes for thematic mapping. The choice of a specific package often depends on the specific requirements of your mapping task and the features you need in your visualization.

Main packages we will be using (readr, tidyr and dplyr are loaded as part of tidyverse)

  • tmap
  • readr
  • tidyr
  • dplyr
  • sf
pacman::p_load(sf,tidyverse,tmap)

Importing the data

Geospatial data: Shapefile or KML file

mpz <- st_read(dsn = "../../data/Week2/geospatial",
               layer = "MP14_SUBZONE_WEB_PL")
Reading layer `MP14_SUBZONE_WEB_PL' from data source 
  `C:\kllygh\IS415-GAA\data\Week2\geospatial' using driver `ESRI Shapefile'
Simple feature collection with 323 features and 15 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: 2667.538 ymin: 15748.72 xmax: 56396.44 ymax: 50256.33
Projected CRS: SVY21
mpz
Simple feature collection with 323 features and 15 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: 2667.538 ymin: 15748.72 xmax: 56396.44 ymax: 50256.33
Projected CRS: SVY21
First 10 features:
   OBJECTID SUBZONE_NO       SUBZONE_N SUBZONE_C CA_IND      PLN_AREA_N
1         1          1    MARINA SOUTH    MSSZ01      Y    MARINA SOUTH
2         2          1    PEARL'S HILL    OTSZ01      Y          OUTRAM
3         3          3       BOAT QUAY    SRSZ03      Y SINGAPORE RIVER
4         4          8  HENDERSON HILL    BMSZ08      N     BUKIT MERAH
5         5          3         REDHILL    BMSZ03      N     BUKIT MERAH
6         6          7  ALEXANDRA HILL    BMSZ07      N     BUKIT MERAH
7         7          9   BUKIT HO SWEE    BMSZ09      N     BUKIT MERAH
8         8          2     CLARKE QUAY    SRSZ02      Y SINGAPORE RIVER
9         9         13 PASIR PANJANG 1    QTSZ13      N      QUEENSTOWN
10       10          7       QUEENSWAY    QTSZ07      N      QUEENSTOWN
   PLN_AREA_C       REGION_N REGION_C          INC_CRC FMEL_UPD_D   X_ADDR
1          MS CENTRAL REGION       CR 5ED7EB253F99252E 2014-12-05 31595.84
2          OT CENTRAL REGION       CR 8C7149B9EB32EEFC 2014-12-05 28679.06
3          SR CENTRAL REGION       CR C35FEFF02B13E0E5 2014-12-05 29654.96
4          BM CENTRAL REGION       CR 3775D82C5DDBEFBD 2014-12-05 26782.83
5          BM CENTRAL REGION       CR 85D9ABEF0A40678F 2014-12-05 26201.96
6          BM CENTRAL REGION       CR 9D286521EF5E3B59 2014-12-05 25358.82
7          BM CENTRAL REGION       CR 7839A8577144EFE2 2014-12-05 27680.06
8          SR CENTRAL REGION       CR 48661DC0FBA09F7A 2014-12-05 29253.21
9          QT CENTRAL REGION       CR 1F721290C421BFAB 2014-12-05 22077.34
10         QT CENTRAL REGION       CR 3580D2AFFBEE914C 2014-12-05 24168.31
     Y_ADDR SHAPE_Leng SHAPE_Area                       geometry
1  29220.19   5267.381  1630379.3 MULTIPOLYGON (((31495.56 30...
2  29782.05   3506.107   559816.2 MULTIPOLYGON (((29092.28 30...
3  29974.66   1740.926   160807.5 MULTIPOLYGON (((29932.33 29...
4  29933.77   3313.625   595428.9 MULTIPOLYGON (((27131.28 30...
5  30005.70   2825.594   387429.4 MULTIPOLYGON (((26451.03 30...
6  29991.38   4428.913  1030378.8 MULTIPOLYGON (((25899.7 297...
7  30230.86   3275.312   551732.0 MULTIPOLYGON (((27746.95 30...
8  30222.86   2208.619   290184.7 MULTIPOLYGON (((29351.26 29...
9  29893.78   6571.323  1084792.3 MULTIPOLYGON (((20996.49 30...
10 30104.18   3454.239   631644.3 MULTIPOLYGON (((24472.11 29...
head(mpz,n=5)
Simple feature collection with 5 features and 15 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: 25867.68 ymin: 28369.47 xmax: 32362.39 ymax: 30435.54
Projected CRS: SVY21
  OBJECTID SUBZONE_NO      SUBZONE_N SUBZONE_C CA_IND      PLN_AREA_N
1        1          1   MARINA SOUTH    MSSZ01      Y    MARINA SOUTH
2        2          1   PEARL'S HILL    OTSZ01      Y          OUTRAM
3        3          3      BOAT QUAY    SRSZ03      Y SINGAPORE RIVER
4        4          8 HENDERSON HILL    BMSZ08      N     BUKIT MERAH
5        5          3        REDHILL    BMSZ03      N     BUKIT MERAH
  PLN_AREA_C       REGION_N REGION_C          INC_CRC FMEL_UPD_D   X_ADDR
1         MS CENTRAL REGION       CR 5ED7EB253F99252E 2014-12-05 31595.84
2         OT CENTRAL REGION       CR 8C7149B9EB32EEFC 2014-12-05 28679.06
3         SR CENTRAL REGION       CR C35FEFF02B13E0E5 2014-12-05 29654.96
4         BM CENTRAL REGION       CR 3775D82C5DDBEFBD 2014-12-05 26782.83
5         BM CENTRAL REGION       CR 85D9ABEF0A40678F 2014-12-05 26201.96
    Y_ADDR SHAPE_Leng SHAPE_Area                       geometry
1 29220.19   5267.381  1630379.3 MULTIPOLYGON (((31495.56 30...
2 29782.05   3506.107   559816.2 MULTIPOLYGON (((29092.28 30...
3 29974.66   1740.926   160807.5 MULTIPOLYGON (((29932.33 29...
4 29933.77   3313.625   595428.9 MULTIPOLYGON (((27131.28 30...
5 30005.70   2825.594   387429.4 MULTIPOLYGON (((26451.03 30...
tail(mpz,n=5)
Simple feature collection with 5 features and 15 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: 20329.95 ymin: 42445.28 xmax: 38889.96 ymax: 50256.33
Projected CRS: SVY21
    OBJECTID SUBZONE_NO         SUBZONE_N SUBZONE_C CA_IND PLN_AREA_N
319      319          7      CONEY ISLAND    PGSZ07      N    PUNGGOL
320      320          9       NORTH COAST    WDSZ09      N  WOODLANDS
321      321          6 SEMBAWANG STRAITS    SBSZ06      N  SEMBAWANG
322      322          7       THE WHARVES    SBSZ07      N  SEMBAWANG
323      323          8      SENOKO NORTH    SBSZ08      N  SEMBAWANG
    PLN_AREA_C          REGION_N REGION_C          INC_CRC FMEL_UPD_D   X_ADDR
319         PG NORTH-EAST REGION      NER 8B13A48924BBE015 2014-12-05 37928.50
320         WD      NORTH REGION       NR 898B2436858382A1 2014-12-05 22147.04
321         SB      NORTH REGION       NR AA1A638CA2B0D5B7 2014-12-05 28352.48
322         SB      NORTH REGION       NR 6D89875A351CF51C 2014-12-05 26945.07
323         SB      NORTH REGION       NR A800CBEE879C1BF9 2014-12-05 24665.79
      Y_ADDR SHAPE_Leng SHAPE_Area                       geometry
319 43351.37   5670.137    1200805 MULTIPOLYGON (((38738.41 42...
320 48031.55  10847.882    2450784 MULTIPOLYGON (((21693.06 48...
321 48918.27   7217.388    1540734 MULTIPOLYGON (((29302.17 48...
322 49552.79  11828.878    1635808 MULTIPOLYGON (((26219.89 50...
323 49482.60   7392.129    2241387 MULTIPOLYGON (((26047.11 50...

Aspatial data: read_csv()

popdata <- read_csv("../../data/Week2/aspatial/respopagesextod2011to2020.csv")
popdata
# A tibble: 984,656 × 7
   PA         SZ                     AG     Sex     TOD                Pop  Time
   <chr>      <chr>                  <chr>  <chr>   <chr>            <dbl> <dbl>
 1 Ang Mo Kio Ang Mo Kio Town Centre 0_to_4 Males   HDB 1- and 2-Ro…     0  2011
 2 Ang Mo Kio Ang Mo Kio Town Centre 0_to_4 Males   HDB 3-Room Flats    10  2011
 3 Ang Mo Kio Ang Mo Kio Town Centre 0_to_4 Males   HDB 4-Room Flats    30  2011
 4 Ang Mo Kio Ang Mo Kio Town Centre 0_to_4 Males   HDB 5-Room and …    50  2011
 5 Ang Mo Kio Ang Mo Kio Town Centre 0_to_4 Males   HUDC Flats (exc…     0  2011
 6 Ang Mo Kio Ang Mo Kio Town Centre 0_to_4 Males   Landed Properti…     0  2011
 7 Ang Mo Kio Ang Mo Kio Town Centre 0_to_4 Males   Condominiums an…    40  2011
 8 Ang Mo Kio Ang Mo Kio Town Centre 0_to_4 Males   Others               0  2011
 9 Ang Mo Kio Ang Mo Kio Town Centre 0_to_4 Females HDB 1- and 2-Ro…     0  2011
10 Ang Mo Kio Ang Mo Kio Town Centre 0_to_4 Females HDB 3-Room Flats    10  2011
# ℹ 984,646 more rows

Data Wrangling functions used

Definition of data wrangling: Data wrangling, also known as data munging or data preprocessing, refers to the process of cleaning, structuring, and transforming raw data into a format that is suitable for analysis or other downstream tasks.

Some functions that we will be using this week & a quick dive into them:

  1. pivot_wider()

    • pivot_wider() is used to reshape data from long to wide format.

  2. mutate()

    • mutate() is used to add new variables or modify existing ones in a data frame.

  3. filter()

    • filter() is used to subset rows based on conditions.

  4. group_by()

    • group_by() is used to group data by one or more variables. It is often used in combination with functions like summarise().

  5. select()

    • select() is used to select specific columns from a data frame.
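
A minimal sketch of how these functions chain together, run on a made-up toy table rather than the exercise data. The subzone names, age-group labels and counts are hypothetical.

```r
library(dplyr)
library(tidyr)

# Toy long-format population table (hypothetical values)
toy <- tibble(
  SZ   = c("A", "A", "A", "B", "B", "B"),
  AG   = c("0_to_4", "25_to_29", "65_to_69",
           "0_to_4", "25_to_29", "65_to_69"),
  Pop  = c(10, 50, 20, 5, 40, 15),
  Time = 2020
)

toy_wide <- toy %>%
  filter(Time == 2020) %>%                              # keep one year
  group_by(SZ, AG) %>%                                  # one group per subzone and age band
  summarise(POP = sum(Pop), .groups = "drop") %>%       # total per group, then ungroup
  pivot_wider(names_from = AG, values_from = POP) %>%   # one column per age band
  mutate(DEPENDENCY = (`0_to_4` + `65_to_69`) / `25_to_29`) %>%
  select(SZ, DEPENDENCY)                                # keep only what is needed

toy_wide
```

Each step hands its result to the next through the pipe, which is the same pattern the full wrangling pipeline in this exercise follows.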

Putting all the functions together, an example of data wrangling:

Explanation of the code below:

  • Filtering Data for the Year 2020: filter(Time == 2020): This line filters the dataset popdata to include only rows where the column Time has a value of 2020.

  • Grouping and Summarizing Data: group_by(PA, SZ, AG) %>% summarise(POP = sum(Pop)): It groups the filtered data by the columns PA, SZ, and AG, and then calculates the sum of the Pop column for each group, creating a new column named POP with the summarized population.

  • Ungrouping Data: ungroup(): This ungroups the data, removing the grouping structure. It is often used after summarizing grouped data.

  • Pivoting Data from Long to Wide Format: pivot_wider(names_from=AG, values_from=POP): It pivots the data from long to wide format, creating separate columns for each unique value in the AG column (assuming AG represents different age groups).

  • Calculating Additional Variables:

    • mutate(YOUNG = rowSums(.[3:6]) + rowSums(.[12])): It calculates a new variable YOUNG by summing the values in columns 3 to 6 and column 12.

    • mutate(`ECONOMY ACTIVE` = rowSums(.[7:11]) + rowSums(.[13:15])): It calculates a new variable `ECONOMY ACTIVE` by summing the values in columns 7 to 11 and columns 13 to 15. The backticks are required because the column name contains a space.

    • mutate(AGED=rowSums(.[16:21])): It calculates a new variable AGED by summing the values in columns 16 to 21.

    • mutate(TOTAL=rowSums(.[3:21])): It calculates a new variable TOTAL by summing the values in columns 3 to 21.

    • mutate(DEPENDENCY = (YOUNG + AGED) / `ECONOMY ACTIVE`): It calculates a new variable DEPENDENCY by dividing the sum of YOUNG and AGED by `ECONOMY ACTIVE`.

  • Selecting Specific Columns:

    • select(PA, SZ, YOUNG, `ECONOMY ACTIVE`, AGED, TOTAL, DEPENDENCY): It keeps only these seven columns in the final popdata2020 dataset.
popdata2020 <- popdata %>%
  filter(Time == 2020) %>%
  group_by(PA, SZ, AG) %>%
  summarise(`POP` = sum(`Pop`)) %>%
  ungroup() %>%
  pivot_wider(names_from = AG, 
              values_from = POP) %>%
  mutate(YOUNG = rowSums(.[3:6]) +
           rowSums(.[12])) %>%
  mutate(`ECONOMY ACTIVE` = rowSums(.[7:11]) +
           rowSums(.[13:15])) %>%
  mutate(`AGED` = rowSums(.[16:21])) %>%
  mutate(`TOTAL` = rowSums(.[3:21])) %>%
  mutate(`DEPENDENCY` = (`YOUNG` + `AGED`) /
           `ECONOMY ACTIVE`) %>%
  select(`PA`, `SZ`, `YOUNG`, 
         `ECONOMY ACTIVE`, `AGED`, 
         `TOTAL`, `DEPENDENCY`)

popdata2020
# A tibble: 332 × 7
   PA         SZ                   YOUNG `ECONOMY ACTIVE`  AGED TOTAL DEPENDENCY
   <chr>      <chr>                <dbl>            <dbl> <dbl> <dbl>      <dbl>
 1 Ang Mo Kio Ang Mo Kio Town Cen…  1440             2610   760  4810      0.843
 2 Ang Mo Kio Cheng San             6640            15460  6050 28150      0.821
 3 Ang Mo Kio Chong Boon            6150            13950  6470 26570      0.905
 4 Ang Mo Kio Kebun Bahru           5540            12090  5120 22750      0.882
 5 Ang Mo Kio Sembawang Hills       2100             3410  1310  6820      1    
 6 Ang Mo Kio Shangri-La            3960             8420  3610 15990      0.899
 7 Ang Mo Kio Tagore                2220             4200  1530  7950      0.893
 8 Ang Mo Kio Townsville            4690            11450  5100 21240      0.855
 9 Ang Mo Kio Yio Chu Kang             0                0     0     0    NaN    
10 Ang Mo Kio Yio Chu Kang East     1220             2300   750  4270      0.857
# ℹ 322 more rows
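
The pipeline above picks age-group columns by position, which depends on the order in which pivot_wider() happens to create them. As a hedged alternative, the sketch below selects columns by name instead. The toy table and the three name vectors (young_cols, active_cols, aged_cols) are hypothetical and cover only a few age bands. It also shows where a NaN such as the one in row 9 comes from: a subzone with no residents gives 0 / 0.

```r
library(dplyr)

# Toy wide-format table; subzone "B" has no residents (hypothetical values)
wide <- tibble(
  SZ         = c("A", "B"),
  `0_to_4`   = c(10, 0),
  `5_to_9`   = c(20, 0),
  `25_to_29` = c(60, 0),
  `65_to_69` = c(30, 0)
)

# Hypothetical groupings of age-band columns, referenced by name
young_cols  <- c("0_to_4", "5_to_9")
active_cols <- c("25_to_29")
aged_cols   <- c("65_to_69")

result <- wide %>%
  mutate(
    YOUNG            = rowSums(across(all_of(young_cols))),
    `ECONOMY ACTIVE` = rowSums(across(all_of(active_cols))),
    AGED             = rowSums(across(all_of(aged_cols))),
    DEPENDENCY       = (YOUNG + AGED) / `ECONOMY ACTIVE`
  )

result
```

Selecting by name keeps the calculation correct even if the column order changes, at the cost of listing the age bands explicitly.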

Joining the attribute data and geospatial data

Things to note:

  • The join fields must match exactly, including case. SUBZONE_N in mpz is in upper case (e.g. MARINA SOUTH), while PA and SZ in popdata2020 are in title case (e.g. Ang Mo Kio Town Centre), so PA and SZ are converted to upper case before joining. The filter() step also drops subzones with no economically active residents, for which the dependency ratio is undefined.
popdata2020 <- popdata2020 %>%
  mutate_at(.vars = vars(PA, SZ), 
          .funs = list(toupper)) %>%
  filter(`ECONOMY ACTIVE` > 0)

Then we combine them with a left join

  • Using a left join here means that all the rows from the left data frame (mpz) are retained in the result, regardless of whether there is a matching row in the right data frame (popdata2020). If there is a match, the corresponding values from the right data frame are included; otherwise, the columns from the right data frame are filled with NA.
mpsz_pop2020 <- left_join(mpz, popdata2020,
                          by = c("SUBZONE_N" = "SZ"))
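
The effect of case on the join can be seen in a small sketch with plain tibbles. The three subzone names are taken from the output above, but the DEPENDENCY values and the object names (zones, pop) are made up for illustration.

```r
library(dplyr)

# Left table: names in upper case, as in mpz
zones <- tibble(SUBZONE_N = c("MARINA SOUTH", "BOAT QUAY", "REDHILL"))

# Right table: names in title case, with hypothetical ratios
pop <- tibble(SZ = c("Marina South", "Redhill"),
              DEPENDENCY = c(0.8, 0.7))

# Without harmonising the case, nothing matches
joined_raw <- left_join(zones, pop, by = c("SUBZONE_N" = "SZ"))

# After converting to upper case, matching rows are filled in
pop_upper    <- mutate(pop, SZ = toupper(SZ))
joined_upper <- left_join(zones, pop_upper, by = c("SUBZONE_N" = "SZ"))

# anti_join() lists the left rows that still have no match
unmatched <- anti_join(zones, pop_upper, by = c("SUBZONE_N" = "SZ"))

joined_raw
joined_upper
unmatched
```

Rows of the left table without a match are kept with NA in the joined columns, which is why NA values can appear in DEPENDENCY after the join.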

Saving R object into rds file

write_rds(): This is a function in R that is part of the readr package. It is used to write an R object to an RDS file.

write_rds(mpsz_pop2020, "../../data/Week2/rds/mpszpop2020.rds")
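
A quick way to convince yourself that nothing is lost is a round trip through a temporary file. The sketch below uses a small made-up data frame and readr's read_rds(), the counterpart of write_rds().

```r
library(readr)

# Small hypothetical object to save
obj <- data.frame(SZ = c("A", "B"), DEPENDENCY = c(0.6, 0.5))

# Write to a temporary file and read it back
path <- tempfile(fileext = ".rds")
write_rds(obj, path)
restored <- read_rds(path)

# TRUE when the restored object is identical to the original
identical(obj, restored)
```

Unlike a CSV export, an RDS file keeps column types and attributes, so the object does not need to be re-parsed when it is loaded again.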

Choropleth mapping

qtm()

Explanation of code:

  • qtm(): This function stands for “Quick thematic map.” It is a simple way to create a choropleth map using tmap.
  • tmap_mode(): Sets the drawing mode. There are two modes
    • plot: static mode (the default)
    • view: interactive mode
  • mpsz_pop2020: This is the data frame used for creating the map.
  • fill = "DEPENDENCY": This specifies the variable to use for filling the map polygons. In this case, the color of each polygon will be determined by the values in the “DEPENDENCY” column of the mpsz_pop2020 data frame.
tmap_mode("plot")
qtm(mpsz_pop2020, 
    fill = "DEPENDENCY")

To switch to view mode

  • Set tmap_options(check.and.fix = TRUE) before creating the map so that tmap checks the geometries and repairs invalid ones automatically:
tmap_options(check.and.fix = TRUE)
tmap_mode("view")
qtm(mpsz_pop2020, 
    fill = "DEPENDENCY")
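
The sketch below shows the kind of check and repair this option refers to, using sf directly on a deliberately invalid "bow-tie" polygon. The geometry is made up, and whether tmap performs exactly these calls internally is an assumption here.

```r
library(sf)

# A self-intersecting "bow-tie" ring: a made-up invalid polygon
bowtie <- st_polygon(list(rbind(c(0, 0), c(1, 1), c(1, 0), c(0, 1), c(0, 0))))
geom <- st_sfc(bowtie)

# Check validity, then repair
valid_before <- st_is_valid(geom)
fixed <- st_make_valid(geom)
valid_after <- st_is_valid(fixed)

valid_before
valid_after
```

The same two functions can be run on a layer such as mpz before mapping to find out which features, if any, are invalid.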

Choropleth map with map elements

Explanation of the code:

  • tm_shape(mpsz_pop2020): tm_shape() is a function that defines the geographical data to be used for mapping. Here, it specifies the mpsz_pop2020 data frame as the source for the map.

  • tm_fill("DEPENDENCY", style = "quantile", palette = "Blues", title = "Dependency ratio"): tm_fill() is used to specify how the polygons should be filled based on a specific variable.

    • In this case, it fills the polygons based on the “DEPENDENCY” column of mpsz_pop2020.

    • style = “quantile” specifies the coloring style as quantile.

    • palette = “Blues” defines the color palette to be used (Blues).

    • title = “Dependency ratio” sets the legend title.

  • tm_layout(...): tm_layout() is used to customize the layout and appearance of the map.

    • main.title specifies the main title of the map. main.title.position sets the position of the main title (“center” in this case).

    • main.title.size adjusts the font size of the main title. legend.height and legend.width control the size of the legend. frame = TRUE adds a frame around the map.

  • tm_borders(alpha = 0.5): tm_borders() adds borders to the map, and alpha = 0.5 sets the transparency level of the borders.

  • tm_compass(type = "8star", size = 2): tm_compass() adds a compass to the map. type = “8star” specifies the type of compass (8-point star). size = 2 sets the size of the compass.

  • tm_scale_bar(): tm_scale_bar() adds a scale bar to the map.

  • tm_grid(alpha = 0.2): tm_grid() adds a grid to the map. alpha = 0.2 sets the transparency level of the grid.

  • tm_credits(...): tm_credits() adds credits or a data source attribution to the map. The text provided in the argument specifies the source of the map data.

tm_shape(mpsz_pop2020)+
  tm_fill("DEPENDENCY", 
          style = "quantile", 
          palette = "Blues",
          title = "Dependency ratio") +
  tm_layout(main.title = "Distribution of Dependency Ratio by planning subzone",
            main.title.position = "center",
            main.title.size = 1.2,
            legend.height = 0.45, 
            legend.width = 0.35,
            frame = TRUE) +
  tm_borders(alpha = 0.5) +
  tm_compass(type="8star", size = 2) +
  tm_scale_bar() +
  tm_grid(alpha =0.2) +
  tm_credits("Source: Planning Sub-zone boundary from Urban Redevelopment Authority (URA)\n and Population data from Department of Statistics DOS", 
             position = c("left", "bottom"))

When to use tm_polygons() and when to use tm_fill(): tm_polygons() is a wrapper of tm_fill() and tm_borders(). tm_fill() shades the polygons, using the default colour scheme unless a palette is given, and tm_borders() draws the polygon boundaries on the choropleth map.

  • Use tm_polygons() When:

    • You want the fill and the borders in a single call. It accepts a variable to colour by, just like tm_fill().
  • Use tm_fill() When:

    • You want to shade the polygon interiors only, either with no borders at all or with borders styled separately through tm_borders().
tm_shape(mpsz_pop2020) +
  tm_polygons("DEPENDENCY")
tm_shape(mpsz_pop2020) +
  tm_fill("DEPENDENCY")

Viewing how tm_fill() and tm_borders() work together. The main arguments of tm_borders() are:

  • lwd sets the line width of the borders. The default is 1.
  • alpha sets the transparency level of the borders
  • col sets the border colour
  • lty sets the border line type. The default is “solid”.
tmap_mode("plot")
tm_shape(mpsz_pop2020) +
  tm_fill("DEPENDENCY") +
  tm_borders(lwd = 1.4, alpha = 0.2)

Customising the classes

Important to note:

  • Before setting custom breaks, compute and inspect the descriptive statistics of the variable
summary(mpsz_pop2020$DEPENDENCY)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
 0.1111  0.7147  0.7866  0.8585  0.8763 19.0000      92 

In R, the $ operator extracts a single column from a data frame by name; here it pulls out the DEPENDENCY column of mpsz_pop2020. With the summary in hand, the class boundaries are set explicitly through the breaks argument of tm_fill():

tm_shape(mpsz_pop2020)+
  tm_fill("DEPENDENCY",
          breaks = c(0, 0.60, 0.70, 0.80, 0.90, 1.00)) +
  tm_borders(alpha=0.5)
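
One thing worth checking with custom breaks is whether they cover the full range of the data. The base R sketch below classifies the five summary statistics shown above with the same breaks. It illustrates the arithmetic only; how tmap itself draws values outside the breaks may depend on the version.

```r
# The breaks used in the map above
breaks <- c(0, 0.60, 0.70, 0.80, 0.90, 1.00)

# Min, 1st quartile, median, 3rd quartile and max from summary()
x <- c(0.1111, 0.7147, 0.7866, 0.8763, 19.0000)

# Values outside the outermost breaks are not assigned to any class
classes <- cut(x, breaks = breaks, include.lowest = TRUE)
outside <- x[is.na(classes)]

classes
outside
```

The maximum of 19 lies above the last break of 1.00, so it falls outside every class. Extending the last break to at least the maximum would bring it back in.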

Map legend

Explanation of the code:

  • tm_fill("DEPENDENCY", style = "jenks", palette = "Blues", legend.hist = TRUE, legend.is.portrait = TRUE, legend.hist.z = 0.1):
    • tm_fill() is used to fill the polygons on the map based on a specific variable (“DEPENDENCY”).
    • style = "jenks" sets the classification method to Jenks natural breaks.
    • palette = "Blues" defines the color palette to be used.
    • legend.hist = TRUE adds a histogram to the legend.
    • legend.is.portrait = TRUE arranges the legend in portrait mode.
    • legend.hist.z = 0.1 sets the position (z-index) of the histogram relative to the other legend elements.
  • tm_layout(main.title = ..., main.title.position = ..., main.title.size = ..., legend.height = ..., legend.width = ..., legend.outside = ..., legend.position = ..., frame = ...):
    • tm_layout() is used to customize the layout and appearance of the map.
    • main.title sets the main title of the map.
    • main.title.position sets the position of the main title.
    • main.title.size adjusts the font size of the main title.
    • legend.height and legend.width control the size of the legend.
    • legend.outside determines whether the legend is placed outside the map.
    • legend.position sets the position of the legend.
    • frame controls whether a frame is added around the map.
tm_shape(mpsz_pop2020)+
  tm_fill("DEPENDENCY", 
          style = "jenks", 
          palette = "Blues", 
          legend.hist = TRUE, 
          legend.is.portrait = TRUE,
          legend.hist.z = 0.1) +
  tm_layout(main.title = "Distribution of Dependency Ratio by planning subzone \n(Jenks classification)",
            main.title.position = "center",
            main.title.size = 1,
            legend.height = 0.45, 
            legend.width = 0.35,
            legend.outside = FALSE,
            legend.position = c("right", "bottom"),
            frame = FALSE) +
  tm_borders(alpha = 0.5)

Drawing Small Multiple Choropleth Maps

The example below draws multiple choropleth maps by passing a vector of variables to the aesthetic argument

tm_shape(mpsz_pop2020)+
  tm_fill(c("YOUNG", "AGED"),
          style = "equal", 
          palette = "Blues") +
  tm_layout(legend.position = c("right", "bottom")) +
  tm_borders(alpha = 0.5) +
  tm_style("white")

The example below shows multiple choropleth maps via tm_facets()

Explanation of the code below:

  • thres.poly = 0 sets the relative-area threshold below which polygons are ignored. With a value of 0, every polygon is drawn.
  • free.coords = TRUE gives each facet its own coordinate range, so every map zooms to the extent of its own region. The coordinate system stays the same. Because the facets are then drawn at different scales, areas and distances cannot be compared directly between them.
  • drop.shapes = TRUE drops from each facet the polygons that do not belong to that facet's group (newer tmap releases call this argument drop.units).
tm_shape(mpsz_pop2020) +
  tm_fill("DEPENDENCY",
          style = "quantile",
          palette = "Blues",
          thres.poly = 0) + 
  tm_facets(by="REGION_N", 
            free.coords=TRUE, 
            drop.shapes=TRUE) +
  tm_layout(legend.show = FALSE,
            title.position = c("center", "center"), 
            title.size = 20) +
  tm_borders(alpha = 0.5)

This is an example of creating multiple choropleth maps via tmap_arrange()

Explanation of the code:

  • asp = 1 sets the aspect ratio of each individual map to 1, so each map is drawn in a square frame.
  • ncol = 2 specifies that the arrangement should have 2 columns, meaning the maps will be displayed side by side. In the second call, nrow = 2 stacks the two maps vertically instead.
youngmap <- tm_shape(mpsz_pop2020)+ 
  tm_polygons("YOUNG", 
              style = "quantile", 
              palette = "Blues")

agedmap <- tm_shape(mpsz_pop2020)+ 
  tm_polygons("AGED", 
              style = "quantile", 
              palette = "Blues")

tmap_arrange(youngmap, agedmap, asp=1, ncol=2)

tmap_arrange(youngmap, agedmap, asp = 1, nrow = 2)